
[Cambricon] Optimize 8 ops with larger BLOCK_SIZE kernels #2218

Open
Ankaluoer wants to merge 2 commits into flagos-ai:master from Ankaluoer:cambricon-opt


@Ankaluoer Ankaluoer commented Apr 2, 2026

PR Category

Operator

Type of Change

Performance Optimization

Description

| Operator | Avg Speedup (Before) | Avg Speedup (After) | Improvement |
| --- | --- | --- | --- |
| abs_ | 0.735 | 0.967 | +31.6% |
| neg_ | 0.685 | 0.953 | +39.1% |
| ceil_ | 0.797 | 0.872 | +9.4% |
| relu_ | 0.755 | 0.919 | +21.7% |
| threshold (new) | 0.520 | 0.889 | +71.2% |
| dropout | 0.562 | 0.589 | +4.9% |
| logical_and_ | 0.595 | 0.867 | +45.9% |
| logical_or_ | 0.595 | 0.820 | +37.7% |
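
The Improvement column appears to be the relative change in average speedup, that is, after / before - 1. The PR does not state the formula, so this is an assumption; the helper name `improvement_pct` below is hypothetical. A minimal sketch that recomputes the first four rows from the rounded averages shown:

```python
def improvement_pct(before: float, after: float) -> float:
    """Relative change in average speedup, in percent (assumed formula)."""
    return (after / before - 1.0) * 100.0


# (before, after) average speedups copied from the table above.
rows = {
    "abs_": (0.735, 0.967),
    "neg_": (0.685, 0.953),
    "ceil_": (0.797, 0.872),
    "relu_": (0.755, 0.919),
}

for name, (before, after) in rows.items():
    print(f"{name}: {improvement_pct(before, after):+.1f}%")
# abs_: +31.6%
# neg_: +39.1%
# ceil_: +9.4%
# relu_: +21.7%
```

For the remaining rows, recomputing from the rounded averages can differ from the listed figure by a few tenths of a point, presumably because the table was computed from unrounded data.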

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

All benchmarks were run on an MLU590-M9DE with the standard FlagGems benchmark suite.
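
For readers unfamiliar with what BLOCK_SIZE controls in the kernels being benchmarked, here is a host-side emulation in plain Python of a one-dimensional, masked elementwise launch. It is a sketch only, not the PR's code: the function name `relu_blocked` and the block sizes are hypothetical, and the PR does not say which block sizes it uses. Each program instance handles `block_size` consecutive offsets, and lanes past the end of the tensor are masked off, mirroring the `tl.where(x > 0, x, 0)` store under `mask` quoted in the review.

```python
def relu_blocked(x, block_size):
    """Emulate a masked 1-D elementwise kernel; return (output, program count)."""
    n = len(x)
    out = [0.0] * n
    # Ceiling division: number of program instances needed to cover n elements.
    num_programs = (n + block_size - 1) // block_size
    for pid in range(num_programs):
        for lane in range(block_size):
            offset = pid * block_size + lane
            if offset < n:  # the mask: out-of-range lanes neither load nor store
                value = x[offset]
                out[offset] = value if value > 0 else 0.0
    return out, num_programs


data = [float(i - 5000) for i in range(10_000)]
small_out, small_programs = relu_blocked(data, 256)    # illustrative block size
large_out, large_programs = relu_blocked(data, 4096)   # illustrative block size
print(small_programs, large_programs)  # 40 3
print(small_out == large_out)          # True
```

The result is identical for both block sizes; only the number of program instances changes. Whether fewer, larger programs are actually faster depends on the hardware and the operator, which is what the benchmark numbers in this PR measure.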

@huangyiqun huangyiqun self-assigned this Apr 3, 2026
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.


@huangyiqun huangyiqun left a comment


Please sign the CLA.

tl.store(OUT_ptr + offsets, tl.where(x > 0, x, 0), mask=mask)


# backward 保留 pointwise_dynamic 不动  [English: backward stays on pointwise_dynamic, unchanged]

Could you please change the Chinese comments here to English?

tl.store(OUT_ptr + offsets, result, mask=mask)


# backward 保留 pointwise_dynamic  [English: backward stays on pointwise_dynamic]

Could you please change the Chinese comments here to English?
